Export Job Task

This page contains these elements:

Note: You can click Variables to insert a predefined variable into a selected field on this tab.

Details Tab

This tab contains these elements:

  • Source Path – HDFS source path for the export that contains the source data.

  • Destination Table – Select the database table that you want to populate with the exported data.

  • Existing Jar File – Enter the name of the existing jar file that contains the record class to use for the export.

  • Allow Updates – Depending on the target database, you can update rows that already exist in the database or insert rows that do not exist yet. By default, the export is carried out using INSERT statements. Select the Allow Updates option to use UPDATE statements only, or INSERT and UPDATE statements together. Choose the update mode from the list box next to the Allow Updates option.

  • Key Columns for Updates – Anchor columns to use for updates. Use a comma-separated list of columns if there is more than one column. This field is mandatory if Allow Updates is selected.

  • Use Batch Mode – Use batch mode for underlying statement execution.

  • Use High Performance Direct Export – Choose to use the direct export fast path.

    MySQL provides a direct mode for exports that uses the mysqlimport tool. This may perform better than the standard JDBC code path. When you export in direct mode with MySQL, the MySQL bulk utility mysqlimport should be available in the shell path of the task process.

  • Number of Mappers – Enter the number of map tasks to run in parallel for the export.

  • Staging Table – The table in which data will be staged before being inserted into the destination table. Support for staging data prior to pushing it into the destination table is not available for high performance direct exports. It is also not available when export is invoked using key columns for updating existing data.

  • Clear Staging Table – Indicates that any data present in the staging table can be deleted.
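
The constraints described on this tab can be sketched in code. The following is a minimal illustration, assuming the task maps onto the Apache Sqoop export tool (the source does not state this) and that the update mode values match Sqoop's updateonly and allowinsert. The ExportDetails class and the function names are hypothetical, and connection arguments are omitted.

```python
import shutil
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExportDetails:
    """Hypothetical container mirroring the fields on the Details tab."""
    source_path: str
    destination_table: str
    allow_updates: bool = False
    update_mode: str = "updateonly"  # assumed values: "updateonly" or "allowinsert"
    key_columns: Optional[List[str]] = None
    batch_mode: bool = False
    direct: bool = False
    num_mappers: Optional[int] = None
    staging_table: Optional[str] = None
    clear_staging_table: bool = False


def validate(d: ExportDetails) -> List[str]:
    """Return the rule violations described on the Details tab."""
    errors = []
    if d.allow_updates and not d.key_columns:
        errors.append("Key Columns for Updates is mandatory with Allow Updates")
    if d.staging_table and d.direct:
        errors.append("Staging Table is not available for direct exports")
    if d.staging_table and d.allow_updates:
        errors.append("Staging Table is not available when updating by key columns")
    return errors


def mysqlimport_on_path() -> bool:
    """Check whether mysqlimport is visible on the current process's PATH."""
    return shutil.which("mysqlimport") is not None


def to_sqoop_args(d: ExportDetails) -> List[str]:
    """Build the assumed Sqoop export arguments (connection options omitted)."""
    errors = validate(d)
    if errors:
        raise ValueError("; ".join(errors))
    args = ["export", "--export-dir", d.source_path, "--table", d.destination_table]
    if d.allow_updates:
        args += ["--update-key", ",".join(d.key_columns),
                 "--update-mode", d.update_mode]
    if d.batch_mode:
        args.append("--batch")
    if d.direct:
        args.append("--direct")
    if d.num_mappers is not None:
        args += ["--num-mappers", str(d.num_mappers)]
    if d.staging_table:
        args += ["--staging-table", d.staging_table]
        if d.clear_staging_table:
            args.append("--clear-staging-table")
    return args
```

Calling validate before building the arguments surfaces conflicting settings, such as a staging table combined with a direct export, before the job is submitted.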

Other Tab

Click this tab to define output formatting, input parsing, and code generation parameters.

This tab contains these elements:

Note: You can click Variables to insert a predefined variable into a selected field on this tab.

In the Output Formatting section, enter:

  • Use default delimiters – Choose this option to use MySQL's default delimiter set.

    Example: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '

  • Column delimiters – Sets the field separator character.

  • Line delimiters – Sets the end-of-line character.

  • Enclosed by – Sets a required field enclosing character.

  • Escaped by – Sets the escape character.

  • Optionally enclosed by – Sets a field enclosing character that is used only when a field needs it, for example when the field contains a delimiter character.
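
To show how these settings interact, here is a simplified sketch of formatting one record. It is an illustration of the role each setting plays, not the product's actual implementation; the format_record function and its parameter names are hypothetical, and the defaults follow the MySQL delimiter example above.

```python
def format_record(fields, column_delim=",", line_delim="\n",
                  enclosed_by=None, escaped_by="\\",
                  optionally_enclosed_by=None):
    """Render one record as delimited text (simplified illustration)."""
    rendered = []
    for value in fields:
        text = str(value)
        # A required encloser always applies; an optional one applies only
        # when the value contains a character that would confuse a parser.
        encloser = enclosed_by
        if encloser is None and optionally_enclosed_by is not None:
            risky = (column_delim, line_delim, optionally_enclosed_by)
            if any(ch in text for ch in risky):
                encloser = optionally_enclosed_by
        if escaped_by:
            # Escape the escape character itself first.
            text = text.replace(escaped_by, escaped_by * 2)
            if encloser:
                # Inside an enclosed field only the encloser needs escaping.
                text = text.replace(encloser, escaped_by + encloser)
            else:
                # Without an encloser, escape the delimiters instead.
                for ch in (column_delim, line_delim):
                    text = text.replace(ch, escaped_by + ch)
        if encloser:
            text = encloser + text + encloser
        rendered.append(text)
    return column_delim.join(rendered) + line_delim
```

For example, format_record([1, "a,b"], optionally_enclosed_by="'") encloses only the second field, because only that field contains the column delimiter.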

In the Input Parsing section, enter:

  • Column delimiters – Sets the input field separator.

  • Line delimiters – Sets the input end-of-line character.

  • Enclosed by – Sets a required field encloser.

  • Escaped by – Sets the input escape character.

  • Optionally enclosed by – Sets a field enclosing character that may or may not be present around each input field.
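
As an illustration of how the input parsing settings are used when reading the source data, the following simplified sketch splits one line into fields. It is not the product's parser; the parse_record function and its parameter names are hypothetical.

```python
def parse_record(line, column_delim=",", line_delim="\n",
                 enclosed_by=None, escaped_by="\\",
                 optionally_enclosed_by=None):
    """Split one delimited line into field values (simplified illustration)."""
    encloser = enclosed_by or optionally_enclosed_by
    # Drop the trailing end-of-line marker, if present.
    if line_delim and line.endswith(line_delim):
        line = line[:-len(line_delim)]
    fields, current, inside, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if escaped_by and ch == escaped_by and i + 1 < len(line):
            # An escaped character is taken literally.
            current.append(line[i + 1])
            i += 2
            continue
        if encloser and ch == encloser:
            # Enclosing characters toggle the "inside a field" state.
            inside = not inside
        elif ch == column_delim and not inside:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields
```

A column delimiter that appears inside an enclosed field, or directly after the escape character, is treated as data and does not start a new field.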

In the CodeGen section, enter:

  • Binary Output Directory – Output directory for compiled objects.

  • Class Name – Sets the generated class name. This overrides the Package Name setting.

  • Source Output Directory – Output directory for generated code.

  • Package Name – Put auto-generated classes in this package.
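
The precedence between Class Name and Package Name can be sketched as follows. This is a hedged illustration: the resolve_class_name function is hypothetical, and the fallback of naming the class after the destination table is an assumption not stated on this page.

```python
from typing import Optional


def resolve_class_name(table: str,
                       class_name: Optional[str] = None,
                       package_name: Optional[str] = None) -> str:
    """Return the name of the generated record class."""
    if class_name:
        # An explicit Class Name overrides the Package Name setting.
        return class_name
    if package_name:
        # Assumption: the class is named after the table inside the package.
        return f"{package_name}.{table}"
    # Assumption: with neither field set, the table name is used as-is.
    return table
```

With both fields filled in, for example class_name="com.acme.OrderRecord" and package_name="com.other", the result is com.acme.OrderRecord and the package setting is ignored.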